Text Summarisation based on Human Language Technologies and its Applications
Abstract
The research work carried out in this thesis focuses on Text Summarisation, proposing and developing compendium, a Text Summarisation tool. This tool takes into account the cognitive perspective, which provides insights into how humans summarise, as well as the computational issues involved in automating the process. To evaluate compendium, we selected different corpora belonging to a wide range of domains and textual genres. We also performed an extrinsic evaluation, applying compendium to three Human Language Technologies tasks: question answering, opinion mining and text classification. The results show that the generated summaries are well suited both to individual users and to other Human Language Technologies applications.
Similar resources
Dense Semantic Graph and its Application in Single Document Summarisation
Semantic graph representation of text is an important part of natural language processing applications such as text summarisation. We have studied two ways of constructing the semantic graph of a document from dependency parsing of its sentences. The first graph is derived from the subject-object-verb representation of sentence, and the second graph is derived from considering more dependency r...
GLEU: Automatic Evaluation of Sentence-Level Fluency
In evaluating the output of language technology applications—MT, natural language generation, summarisation—automatic evaluation techniques generally conflate measurement of faithfulness to source content with fluency of the resulting text. In this paper we develop an automatic evaluation metric to estimate fluency alone, by examining the use of parser outputs as metrics, and show that they cor...
Automated text summarisation and evidence-based medicine: A survey of two domains
The practice of evidence-based medicine (EBM) urges medical practitioners to utilise the latest research evidence when making clinical decisions. Because of the massive and growing volume of published research on various medical topics, practitioners often find themselves overloaded with information. As such, natural language processing research has recently commenced exploring techniques for p...
Salience-Based Content Characterisation Of Text Documents
Summarisation is poised to become a generally accepted solution to the larger problem of content analysis. We offer an alternative perspective on this problem, by tackling the complementary task of content characterisation; our motivation for doing so is to avoid some of the fundamental shortcomings of summarisation technologies today. Traditionally, the document summarisation task has been tac...
Software Infrastructure for Natural Language
We classify and review current approaches to software infrastructure for research, development and delivery of NLP systems. The task is motivated by a discussion of current trends in the field of NLP and Language Engineering. We describe a system called GATE (a General Architecture for Text Engineering) that provides a software infrastructure on top of which heterogeneous NLP processing modules m...
Journal title:
- Procesamiento del Lenguaje Natural
Volume 48, Issue
Pages -
Publication date 2012